1,242 research outputs found

    Alchemical and structural distribution based representation for improved QML

    We introduce a representation of any atom in any chemical environment for the generation of efficient quantum machine learning (QML) models of common electronic ground-state properties. The representation is based on scaled distribution functions explicitly accounting for elemental and structural degrees of freedom. Resulting QML models afford very favorable learning curves for properties of out-of-sample systems including organic molecules, non-covalently bonded protein side-chains, (H$_2$O)$_{40}$-clusters, as well as diverse crystals. The elemental components help to lower the learning curves, and, through interpolation across the periodic table, even enable "alchemical extrapolation" to covalent bonding between elements not part of training, as evinced for single, double, and triple bonds among main-group elements.

    Water on hexagonal boron nitride from diffusion Monte Carlo

    Despite a recent flurry of experimental and simulation studies, an accurate estimate of the interaction strength of water molecules with hexagonal boron nitride is lacking. Here we report quantum Monte Carlo results for the adsorption of a water monomer on a periodic hexagonal boron nitride sheet, which yield a water monomer interaction energy of -84 +/- 5 meV. We use the results to evaluate the performance of several widely used density functional theory (DFT) exchange-correlation functionals, and find that they all deviate substantially. Differences in interaction energies between adsorption sites are, however, better reproduced by DFT.

    Exploring water adsorption on isoelectronically doped graphene using alchemical derivatives

    The design and production of novel 2-dimensional materials has seen great progress in the last decade, prompting further exploration of the chemistry of such materials. Doping and hydrogenating graphene is an experimentally realised method of changing its surface chemistry, but there is still a great deal to be understood on how doping impacts on the adsorption of molecules. Developing this understanding is key to unlocking the potential applications of these materials. High throughput screening methods can provide particularly effective ways to explore vast chemical compositions of materials. Here, alchemical derivatives are used as a method to screen the dissociative adsorption energy of water molecules on various BN doped topologies of hydrogenated graphene. The predictions from alchemical derivatives are assessed by comparison to density functional theory. This screening method is found to predict dissociative adsorption energies that span a range of more than 2 eV, with a mean absolute error $<0.1$ eV. In addition, we show that the quality of such predictions can be readily assessed by examination of the Kohn-Sham highest occupied molecular orbital in the initial states. In this way, the root mean square error in the dissociative adsorption energies of water is reduced by almost an order of magnitude (down to $\sim 0.02$ eV) after filtering out poor predictions. The findings point the way towards a reliable use of first order alchemical derivatives for efficient screening procedures.

    Operator quantum machine learning: Navigating the chemical space of response properties

    The identification and use of structure-property relationships lie at the heart of the chemical sciences. Quantum mechanics forms the basis for the unbiased virtual exploration of chemical compound space (CCS), imposing substantial compute needs if chemical accuracy is to be reached. In order to accelerate predictions of quantum properties without compromising accuracy, our lab has been developing quantum machine learning (QML) based models which can be applied throughout CCS. Here, we briefly explain, review, and discuss the recently introduced operator formalism which substantially improves the data efficiency for QML models of common response properties.

    Geometry Relaxation and Transition State Search throughout Chemical Compound Space with Quantum Machine Learning

    We use energies and forces predicted within response operator based quantum machine learning (OQML) to perform geometry optimization and transition state search calculations with legacy optimizers. For randomly sampled initial coordinates of small organic query molecules we report systematic improvement of equilibrium and transition state geometry output as training set sizes increase. Out-of-sample S$_\mathrm{N}$2 reactant complexes and transition state geometries have been predicted using the LBFGS and the QST2 algorithms with RMSDs of 0.16 and 0.4 Å, respectively, after training on up to 200 reactant complex relaxations and transition state search trajectories from the QMrxn20 data set. For geometry optimizations, we have also considered relaxation paths of up to 5,500 constitutional isomers with sum formula C$_7$H$_{10}$O$_2$ from the QM9 database. Using the resulting OQML models with an LBFGS optimizer reproduces the minimum geometry with an RMSD of 0.14 Å. For converged equilibrium and transition state geometries, subsequent vibrational normal mode frequency analysis indicates deviation from MP2 reference results by on average 14 and 26 cm$^{-1}$, respectively. While the numerical cost for OQML predictions is negligible in comparison to DFT or MP2, the number of steps until convergence is typically larger in either case. The success rate for reaching convergence, however, improves systematically with training set size, underscoring OQML's potential for universal applicability.

    Constant Size Molecular Descriptors For Use With Machine Learning

    A set of molecular descriptors whose length is independent of molecular size is developed for machine learning models that target thermodynamic and electronic properties of molecules. These features are evaluated by monitoring performance of kernel ridge regression models on well-studied data sets of small organic molecules. The features include connectivity counts, which require only the bonding pattern of the molecule, and encoded distances, which summarize distances between both bonded and non-bonded atoms and so require the full molecular geometry. In addition to having constant size, these features summarize information regarding the local environment of atoms and bonds, such that models can take advantage of similarities resulting from the presence of similar chemical fragments across molecules. Combining these two types of features leads to models whose performance is comparable to or better than the current state of the art. The features introduced here have the advantage of leading to models that may be trained on smaller molecules and then used successfully on larger molecules. Comment: 18 pages, 5 figures

    FCHL revisited: Faster and more accurate quantum machine learning

    We introduce the FCHL19 representation for atomic environments in molecules or condensed-phase systems. Machine learning models based on FCHL19 are able to yield predictions of atomic forces and energies of query compounds with chemical accuracy on the scale of milliseconds. FCHL19 is a revision of our previous work [Faber et al. 2018] where the representation is discretized and the individual features are rigorously optimized using Monte Carlo optimization. Combined with a Gaussian kernel function that incorporates elemental screening, chemical accuracy is reached for energy learning on the QM7b and QM9 datasets after training for minutes and hours, respectively. The model also shows good performance for non-bonded interactions in the condensed phase for a set of water clusters, with a mean absolute error (MAE) in binding energy of less than 0.1 kcal/mol/molecule after training on 3,200 samples. For force learning on the MD17 dataset, our optimized model similarly displays state-of-the-art accuracy with a regressor based on Gaussian process regression. When the revised FCHL19 representation is combined with the operator quantum machine learning regressor, forces and energies can be predicted in only a few milliseconds per atom. The model presented herein is fast and lightweight enough for use in general chemistry problems as well as molecular dynamics simulations.

    The Privatization Origins of Political Corporations: Evidence from the Pinochet Regime

    We show that the sale of state-owned firms in dictatorships can help political corporations to emerge and persist over time. Using new data, we characterize Pinochet’s privatizations in Chile and find that some firms were sold underpriced to politically connected buyers. These newly private firms benefited financially from the Pinochet regime. Once democracy arrived, they formed connections with the new government, financed political campaigns, and were more likely to appear in the Panama Papers. These findings reveal how dictatorships can influence young democracies using privatization reforms.

    ML Models of Vibrating H$_2$CO: Comparing Reproducing Kernels, FCHL and PhysNet

    Machine Learning (ML) has become a promising tool for improving the quality of atomistic simulations. Using formaldehyde as a benchmark system for intramolecular interactions, a comparative assessment of ML models based on state-of-the-art variants of deep neural networks (NN), reproducing kernel Hilbert space (RKHS+F), and kernel ridge regression (KRR) is presented. Learning curves for energies and atomic forces indicate rapid convergence towards excellent predictions for B3LYP, MP2, and CCSD(T)-F12 reference results for modestly sized (in the hundreds) training sets. Typically, learning curve offsets decay as one goes from NN (PhysNet) to RKHS+F to KRR (FCHL). Conversely, the predictive power for extrapolation of energies towards new geometries increases in the same order, with RKHS+F and FCHL performing almost equally. For harmonic vibrational frequencies, the picture is less clear, with PhysNet and FCHL yielding flat learning at $\sim 1$ and $\sim 0.2$ cm$^{-1}$, respectively, no matter which reference method, while RKHS+F models level off for B3LYP and exhibit continued improvements for MP2 and CCSD(T)-F12. Finite-temperature molecular dynamics (MD) simulations with the same initial conditions yield indistinguishable infrared spectra with good performance compared with experiment, except for the high-frequency modes involving hydrogen stretch motion, which is a known limitation of MD for vibrational spectroscopy. For sufficiently large training set sizes all three models can detect insufficient convergence ("noise") of the reference electronic structure calculations in that the learning curves level off. Transfer learning (TL) from B3LYP to CCSD(T)-F12 with PhysNet indicates that additional improvements in data efficiency can be achieved.